1 Statement of the problems studied

This project has investigated novel design methodologies for low-energy VLSI circuits
and systems. Its primary focus has been on energy-recovering (a.k.a. adiabatic) circuit
architectures. By steering currents across low voltage drops and by recycling
undissipated energy, these circuits can potentially achieve significantly more efficient
operation than conventional digital circuits. Early investigations into adiabatic circuits
have yielded very complex designs that are impractical for high-speed design.
The main objective of this project has been the discovery of extremely simple
energy-recovering circuits which can achieve very high energy efficiency at relatively high
operating frequencies.
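
The trade-off between efficiency and speed can be illustrated with a first-order model. The sketch below is not taken from our designs: it assumes the textbook approximation that charging and then discharging a load capacitance C through a path resistance R with a linear ramp of duration T dissipates about 2(RC/T)CV^2 per cycle (valid only for T much larger than RC), whereas a conventional CMOS node dissipates about CV^2 per cycle. The function names and all component values are hypothetical.

```python
def conventional_energy(c_load, v_dd):
    """Energy dissipated per charge/discharge cycle by a conventional CMOS node."""
    return c_load * v_dd ** 2


def adiabatic_energy(c_load, v_dd, r_path, t_ramp):
    """Approximate dissipation per cycle for linear-ramp charging and discharging.

    First-order model, assumed valid only when t_ramp >> r_path * c_load.
    """
    return 2.0 * (r_path * c_load / t_ramp) * c_load * v_dd ** 2


def efficiency_gain(c_load, v_dd, r_path, t_ramp):
    """Ratio of conventional to adiabatic dissipation; equals t_ramp / (2 * R * C)."""
    return conventional_energy(c_load, v_dd) / adiabatic_energy(
        c_load, v_dd, r_path, t_ramp
    )


if __name__ == "__main__":
    C = 100e-15  # hypothetical load capacitance, farads
    V = 3.0      # peak voltage, volts
    R = 1e3      # hypothetical path resistance, ohms
    for t_ramp in (1e-9, 5e-9, 25e-9):
        gain = efficiency_gain(C, V, R, t_ramp)
        print(f"ramp {t_ramp * 1e9:.0f} ns: gain {gain:.1f}x")
```

In this model the gain shrinks linearly as the ramp gets shorter, which is why simple circuits with low path resistance matter at high operating frequencies.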

In addition to energy-recovering circuits, this project has also explored approaches
for low-energy conventional CMOS design. This task focused on the development of
circuit optimizations that can be used to automatically improve the energy efficiency
of any given design.

2 Summary of the most important findings

The main contribution of our research was the discovery of TSEL, the first true
single-phase energy-recovering circuit family [7]. TSEL achieves very efficient operation in
comparison with conventional CMOS and other adiabatic circuit families. Due to its
extremely simple clocking requirements, TSEL is ideal for high-speed design. Moreover,
TSEL gates are straightforward to cascade and are thus particularly suitable
for designing energy-efficient datapath components with regular structures.

Several adiabatic logic families have been proposed in recent years. Unlike
TSEL, they all require at least two-phase clocks to control cascades. A scheme with
asymptotically zero dissipation that requires reversible computations was described
in [17]. Two relatively simple energy-recovering logic styles that use diodes to avoid
reversibility were proposed in [6, 11]. Complementary adiabatic MOS (CAMOS) and
fully adiabatic MOS (ADMOS) were proposed in [5] for high-speed design. Various
adiabatic circuit families were introduced and evaluated in [9, 12, 14, 15, 16], including
2N-2P, 2N-2N2P, pass-transistor adiabatic logic (PAL), and clocked CMOS adiabatic logic
(CAL). A 16-bit microprocessor that relies on clock-powered logic to reduce energy
consumption has been described in [1, 2].

TSEL is a partially adiabatic logic akin to 2N-2P, 2N-2N2P, and CAL. Power is
supplied to TSEL gates by a single-phase clock. Cascades are composed of alternating
PMOS and NMOS-type gates. Two DC reference voltages ensure high-speed,
high-efficiency operation and enable the cascading of TSEL gates in an NP domino style.

The basic structure of PMOS and NMOS gates in TSEL is shown in Figure 1. The
PMOS inverter in Figure 1(a) comprises a pair of cross-coupled transistors (MP1 and
MP2), a pair of current control switches (MP3 and MP4), and two function blocks
(MP5 and MP6). The port PHI supplies the power clock Φ. The port PREF supplies
a constant reference voltage V_PREF. Inputs and outputs are dual-rail encoded. The
PMOS gate operates in two phases: discharge and evaluate.

Figure 1: (a) A PMOS and (b) an NMOS inverter in TSEL.


The inverter in Figure 1(b) shows the basic structure of a TSEL NMOS gate. The
operation of the NMOS gate is complementary to that of the PMOS gate and consists
of two phases: charge and evaluate.

TSEL cascades are built by stringing together alternating PMOS and NMOS
gates. The only signal required for controlling a TSEL cascade is the power clock
Φ. A single reference voltage suffices to ensure correct operation. The speed and
energy efficiency of the cascade can be improved, however, by using separate reference
voltages V_PREF and V_NREF for the PMOS and NMOS gates, respectively. To
ensure minimal dissipation, these voltages must be tuned depending on the operating
frequency.

In the course of this project, we designed, developed, and evaluated layouts of
several arithmetic units in TSEL, CMOS, and several other adiabatic logic families
[8, 9, 10]. Our designs included 8-bit multipliers, 8-bit carry-lookahead adders, and
an 8-point/4-bit Hadamard Transform (HT) module. In layout simulations with
0.5µm standard CMOS process parameters, our TSEL designs function for operating
frequencies in excess of 200MHz. At 80MHz, they are as energy-efficient as any other
energy-recovering alternative we considered and about 5 times more energy-efficient
than conventional CMOS. At 200MHz, our TSEL designs are 2 times more efficient
than any other adiabatic alternative and about 4 times more energy-efficient than
their conventional static CMOS counterparts.

In this report, we highlight our pipelined 8-point/4-bit one-dimensional HT module,
whose basic architecture is shown in Figure 2. This module was chosen in consultation
with members of the “Low Energy Electronics Design for Mobile Platforms”
MURI project at the EECS Department of the University of Michigan. Hadamard
Transform modules are essential components of modern wireless communication devices
and are responsible for a significant fraction of a communicator’s dissipation,
particularly at standby.

Two versions of the HT module were developed: a TSEL-PN module in which
the pipeline starts with PMOS-type gates, and a TSEL-NP module in which the
pipeline starts with NMOS-type gates. Each design comprised 6,768 transistors and
was developed in standard 0.5µm CMOS technology. In HSPICE layout simulations,
both designs functioned correctly for frequencies exceeding 280MHz with a 3.0V peak
power-clock voltage.

Figure 2: Block diagram of pipelined 8-point Hadamard Transform module. The port
PHI supplies the power clock. PREF and NREF are constant reference voltage ports.

Figure 3: (a) A pipelined 4-bit carry-lookahead adder with stages in the order PNPN.
(b) Adder layout in TSEL.

Figure 4: Energy consumption of 8-point HT modules vs. operating frequency.

Each of our HT modules consisted of twelve interconnected elementary cells. Each
elementary cell comprised two 4-stage pipelined, 4-bit carry-lookahead adders [4].
The netlist and layout of each adder are shown in Figure 3. For driving purposes, the
W/L ratio of the cross-coupled latches in each elementary cell was sized to 3.0/0.6.
Each transistor in the evaluation trees was sized to 0.9/0.6. To achieve comparable
speeds for the PMOS and the NMOS gates in our design, V_PREF was set to a larger
value than V_NREF.
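
The cell count above is consistent with the standard fast Hadamard transform, in which an 8-point transform takes three stages of four butterflies, each butterfly producing one sum and one difference. The following minimal sketch illustrates that structure in software. It is an assumption that each elementary cell corresponds to one such butterfly; the function name is hypothetical, and the sketch ignores the 4-bit word width and pipelining of the hardware.

```python
def hadamard_transform(samples):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Returns the transformed list and the number of butterflies performed.
    Each butterfly computes one sum and one difference.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    data = list(samples)
    butterflies = 0
    span = 1
    while span < n:  # one pass per stage; log2(n) stages in total
        for start in range(0, n, 2 * span):
            for i in range(start, start + span):
                a, b = data[i], data[i + span]
                data[i] = a + b         # adder output
                data[i + span] = a - b  # subtractor output
                butterflies += 1
        span *= 2
    return data, butterflies


if __name__ == "__main__":
    outputs, cells = hadamard_transform([1, 0, 0, 0, 0, 0, 0, 0])
    print(outputs, cells)
```

For an 8-element input the function reports twelve butterflies, matching the twelve elementary cells with two adders each.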

We compared the performance of our TSEL HT modules with corresponding designs
in static CMOS and in the adiabatic families 2N-2P and PAL that have a
cross-coupled transistor structure similar to TSEL and can be implemented in standard
CMOS technology. In our simulations, the power clock was an ideal sinusoid
with 100% energy recycling and a peak voltage of 3.0V. All results were obtained
using the distributed capacitance and resistance extracted from the layouts of the
8-point HT modules and include the dissipation of the internal clock networks. All
primary inputs were driven by typical gate outputs.

Figure 4 gives the per-cycle energy consumption of our designs as a function of
their operating frequencies. For the selected W/L ratios, PAL and 2N-2P break down
above 120MHz and 200MHz, respectively, due to the short duration of their evaluation
phases and the PMOS-only nature of their cross-coupled transistors. As expected,
the energy consumption of the static CMOS implementation does not vary with the
operating frequency. For operating frequencies above 40MHz, TSEL is less dissipative
than every other design. At 200MHz, both TSEL modules are approximately 1.5 times
more energy-efficient than 2N-2P. In comparison with their static CMOS counterparts,
both TSEL designs are 4 to 5 times more energy-efficient.

Our research activities over the past four years extended beyond energy-recovering
design to encompass conventional CMOS. In particular, we investigated
architectural-level circuit transformations for reducing energy consumption [13]. The main idea
behind our technique was to treat each flip-flop as a master-slave pair of level-sensitive
latches and transform the original circuit architecture by relocating one of the two
latches. We coined the term fixed-phase retiming to describe our optimization.
Fixed-phase retiming places latches on interconnections with high capacitance and switching
activity. Energy dissipation is thus reduced by shielding the high-capacitance node
from any switching that occurs during the opaque phase in the operation of the
level-sensitive latch. Fixed-phase retiming also reduces energy dissipation by reducing
the number of latches in the circuit. We have designed and implemented optimal
algorithms that perform fixed-phase retiming while maintaining the functionality of
the original circuit. Our algorithms take into account timing constraints for
level-sensitive latches and reduce the power dissipation of the original circuits without
decreasing their operating frequency.
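
The shielding effect can be illustrated with a toy model. The sketch below is not the retiming algorithm of [13]; it only assumes that each transition on a node of capacitance C dissipates about (1/2)CV^2 and that a level-sensitive latch passes its input while transparent and holds its last value while opaque. The function names, the sampled waveform, and the assumption that the latch initially holds the first sample are all hypothetical.

```python
def count_transitions(values):
    """Number of value changes between consecutive samples."""
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)


def latch_output(values, transparent):
    """Model a level-sensitive latch sampled at discrete time steps.

    The latch follows its input when transparent and holds the last
    passed value when opaque. It is assumed to hold values[0] initially.
    """
    held = values[0]
    out = []
    for value, is_transparent in zip(values, transparent):
        if is_transparent:
            held = value
        out.append(held)
    return out


def switching_energy(values, c_node, v_dd):
    """Dissipation of a node, assuming 0.5 * C * V^2 per transition."""
    return 0.5 * c_node * v_dd ** 2 * count_transitions(values)


if __name__ == "__main__":
    # Hypothetical signal that glitches before settling to 1.
    signal = [0, 1, 0, 1, 0, 1, 1, 1]
    # The latch is opaque while the signal glitches, transparent once it settles.
    phase = [False, False, False, False, False, True, True, True]
    shielded = latch_output(signal, phase)
    c_wire, v_dd = 200e-15, 3.0  # hypothetical values
    print(switching_energy(signal, c_wire, v_dd))
    print(switching_energy(shielded, c_wire, v_dd))
```

In this example the unshielded node sees five transitions while the node behind the latch sees one, so placing the latch in front of the high-capacitance interconnection removes the dissipation caused by the glitches.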

In addition to power optimization techniques, we commenced the investigation of
power macromodeling techniques for sequential CMOS circuits [3]. In the next few
years, we expect to expand our research activities in this area. In particular,
we plan to investigate efficient and accurate power macromodeling techniques for
programmable systems.

3 List of publications and technical reports

• G. Bernacchia and M. C. Papaefthymiou. Analytical macromodeling for
high-level power estimation. Accepted for publication at the 1999 IEEE/ACM
International Conference on Computer-Aided Design, November 1999.

• S. Kim and M. C. Papaefthymiou. Low-energy adder design with a
single-phase source-coupled adiabatic logic. Accepted for publication at PATMOS
’99, 9th International Workshop on Power and Timing Modeling, Optimization
and Simulation, October 1999.

• S. Hong, S. Kim, M. C. Papaefthymiou, and W. E. Stark. Low power parallel
multiplier design for DSP applications through coefficient optimization.
Accepted for publication at the 12th IEEE International ASIC/SOC Conference,
September 1999.

• S. Kim and M. C. Papaefthymiou. Single-phase source-coupled adiabatic logic.
Accepted for publication at the International Symposium on Low-Power
Electronics and Design, August 1999.

• S. Hong, S. Kim, M. C. Papaefthymiou, and W. E. Stark. Power-complexity
analysis of pipelined VLSI FFT architectures for low energy wireless
communication applications. Accepted for publication at the 42nd Midwest Symposium
on Circuits and Systems, August 1999.

• S. Kim and M. C. Papaefthymiou. Pipelined DSP design with a true
single-phase energy-recovering logic style. In VOLTA’99 IEEE Alessandro Volta
International Workshop on Low Power Design, March 1999.


• S. Kim and M. C. Papaefthymiou. True single-phase energy-recovering logic
for low-power, high-speed VLSI. In International Symposium on Low-Power
Electronics and Design, August 1998.

• M. C. Knapp, P. J. Kindlmann, and M. C. Papaefthymiou. “Design and
Evaluation of Adiabatic Arithmetic Units.” Analog Integrated Circuits and Signal
Processing, Special Issue on “Analog Design Issues in Digital VLSI Circuits and
Systems”, Vol. 14, pp. 71-79, 1997.

• K. N. Lalgudi and M. C. Papaefthymiou. Fixed-Phase Retiming for Low Power
Design. In 1996 International Symposium on Low Power Electronics and
Design, August 1996.

• M. C. Knapp, P. J. Kindlmann, and M. C. Papaefthymiou. Implementing
and Evaluating Adiabatic Arithmetic Units. In 1996 IEEE Custom Integrated
Circuits Conference, May 1996.

4 List of all participating research personnel

Prof. Marios C. Papaefthymiou, PI

• Promoted to Associate Professor, University of Michigan, June 1999.

• Technical Program Committee Member, IEEE/ACM International Conference
on Computer-Aided Design, 1996-1998.

• Associate Editor, IEEE Transactions on Computer-Aided Design of Integrated
Circuits, June 1997-present.

• Associate Editor, IEEE Transactions on Computers, June 1999-present.

Suhwan Kim, Graduate Research Assistant

• Achieved PhD candidacy, University of Michigan, December 1998.

Kumar Lalgudi, Graduate Research Assistant

• Received PhD degree in Electrical Engineering, Yale University, January 1997.

Micah Knapp, Graduate Research Assistant

• Received MS degree in Electrical Engineering, Yale University, May 1996.

5 Report of inventions

“True Single-Phase Adiabatic Logic.” Patent application No. 60/095,380 filed in the
U.S. Patent and Trademark Office on 8/5/98.


References 

[1] W. C. Athas, N. Tzartzanis, L. Svensson, L. Peterson, H. Li, X. Jiang, P. Wang,
and W-C. Liu. AC-1: A clock-powered microprocessor. In Proceedings of
International Symposium on Low-Power Electronics and Design, pages 18-20, 1997.

[2] W. C. Athas, N. Tzartzanis, L. J. Svensson, and L. Peterson. A low-power
microprocessor based on resonant energy. IEEE Journal of Solid-State Circuits,
SC-32(11):1693-1701, November 1997.

[3] G. Bernacchia and M. C. Papaefthymiou. Analytical macromodeling for
high-level power estimation. In Digest of Technical Papers of the 1999 IEEE/ACM
International Conference on CAD, November 1999.

[4] R. P. Brent and H. T. Kung. A regular layout for parallel adders. IEEE
Transactions on Computers, C-31(3):260-264, March 1982.

[5] W. K. De and J. D. Meindl. Complementary adiabatic and fully adiabatic MOS
logic families for gigascale integration. In Proceedings of International Solid-State
Circuits Conference, 1996.

[6] A. G. Dickinson and J. S. Denker. Adiabatic dynamic logic. IEEE Journal of
Solid-State Circuits, 30(3):311-315, March 1995.

[7] S. Kim and M. C. Papaefthymiou. True single-phase energy-recovering logic for
low-power, high-speed VLSI. In 1998 International Symposium on Low Power
Electronics and Design, August 1998.

[8] S. Kim and M. C. Papaefthymiou. Pipelined DSP design with a true single-phase
energy-recovering logic style. In Proceedings of VOLTA’99 IEEE Alessandro
Volta Memorial International Workshop on Low Power Design, pages 135-143,
March 1999.

[9] M. Knapp, P. Kindlmann, and M. C. Papaefthymiou. Implementing and
evaluating adiabatic arithmetic units. In IEEE 1996 Custom Integrated Circuits
Conference, May 1996.

[10] M. Knapp, P. Kindlmann, and M. C. Papaefthymiou. Design and evaluation
of adiabatic arithmetic units. Analog Integrated Circuits and Signal Processing,
pages 71-79, 1997. Special Issue on Analog Design Issues in Digital VLSI Circuits
and Systems.

[11] A. Kramer, J. S. Denker, S. C. Avery, A. G. Dickinson, and T. R. Wik.
Adiabatic computing with the 2N-2N2D logic family. In IEEE Symposium on VLSI
Circuits, Digest of Technical Papers, pages 25-26, April 1994.

[12] A. Kramer, J. S. Denker, B. Flower, and J. Moroney. 2nd order adiabatic
computation with 2N-2P and 2N-2N2P logic circuits. In 1995 International Symposium
on Low Power Design, pages 191-196, 1995.


[13] K. N. Lalgudi and M. C. Papaefthymiou. Fixed-phase retiming for low-power
design. In 1996 International Symposium on Low Power Electronics and Design,
August 1996.

[14] D. Maksimovic, V. G. Oklobdzija, B. Nikolic, and K. W. Current. Clocked CMOS
adiabatic logic with integrated single-phase power-clock supply: Experimental
results. In Proceedings of International Symposium on Low Power Electronics
and Design, pages 323-327, August 1997.

[15] Y. Moon and D. Jeong. An efficient charge recovery logic circuit. IEEE Journal
of Solid-State Circuits, SC-31(4):514-522, April 1996.

[16] V. G. Oklobdzija and D. Maksimovic. Pass-transistor adiabatic logic using single
power-clock supply. IEEE Transactions on Circuits and Systems-II: Analog and
Digital Signal Processing, 44(10):842-846, October 1997.

[17] S. Younis and T. Knight. Practical implementation of charge recovering
asymptotically zero-power CMOS. In Research in Integrated Systems: Proceedings of
the 1993 Symposium, March 1993.

